Cost models for LookupCoin, ValueContains, ValueData, UnValueData builtins
#7344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
kwxm left a comment
Here are some initial comments. I'll come back and add some more later. I need to look at the benchmarks properly though.
plutus-core/plutus-core/src/PlutusCore/Evaluation/Machine/ExMemoryUsage.hs
zliu41 left a comment
In order to benchmark the worst case, I think you should also ensure that lookupCoin always hits the largest inner map (or at least, such cases should be well-represented).
Also, we'll need to re-run benchmarking for unValueData after adding the enforcement of integer range.
plutus-core/cost-model/create-cost-model/BuiltinMemoryModels.hs
```
@@ -12094,203 +12094,710 @@ IndexArray/42/1,1.075506579052359e-6,1.0748433439930302e-6,1.0762684407023462e-6
IndexArray/46/1,1.0697135554442532e-6,1.0690902192698813e-6,1.0704133377013816e-6,2.2124820728450233e-9,1.8581237858977844e-9,2.6526943923047553e-9
IndexArray/98/1,1.0700747499373992e-6,1.0693842628239684e-6,1.070727062396803e-6,2.2506114869928674e-9,1.9376849028666025e-9,2.7564941558204088e-9
IndexArray/82/1,1.0755056682976695e-6,1.0750405368241111e-6,1.076102212770973e-6,1.8355219893844098e-9,1.5161640335164335e-9,2.4443625958006994e-9
Bls12_381_G1_multiScalarMul/1/1,8.232134704712041e-5,8.228195390475752e-5,8.23582682466318e-5,1.224261187989977e-7,9.011720721178711e-8,1.843107342917502e-7
```
GitHub seems to think that the data for all of the BLS functions have changed, but I don't think they have.
The file on master contains Windows-style line terminators (\r\n) for BLS lines:

```
git show master:plutus-core/cost-model/data/benching-conway.csv | grep "Bls12_381_G1_multiScalarMul/1/1" | od -c | grep -C1 "\r"
0000000   B   l   s   1   2   _   3   8   1   _   G   1   _   m   u   l
0000020   t   i   S   c   a   l   a   r   M   u   l   /   1   /   1   ,
0000040   8   .   2   3   2   1   3   4   7   0   4   7   1   2   0   4
--
0000200   8   7   1   1   e   -   8   ,   1   .   8   4   3   1   0   7
0000220   3   4   2   9   1   7   5   0   2   e   -   7  \r  \n
```

This PR changes \r\n to \n.
plutus-core/plutus-core/src/PlutusCore/Evaluation/Machine/ExMemoryUsage.hs
Add ValueTotalSize and ValueLogOuterSizeAddLogMaxInnerSize to the DefaultUni builtin type system, enabling these wrappers to be used in builtin function signatures. Both wrappers are coercions of the underlying Value type with specialized memory measurement behavior.
Add cost model parameters for four new Value-related builtins: LookupCoin (3 arguments), ValueContains (2 arguments), ValueData (1 argument), and UnValueData (1 argument). Update the BuiltinCostModelBase type, memory models, cost model names, and unit cost models. This prepares the infrastructure for actual cost models to be fitted from benchmarks.
Apply memory wrappers and cost model parameters to Value builtin denotations. LookupCoin wraps Value with ValueLogOuterSizeAddLogMaxInnerSize, ValueContains uses the wrapper for container and ValueTotalSize for contained value. Replaces unimplementedCostingFun with actual cost model parameters. Updates golden type signatures to reflect wrapper types.
Add systematic benchmarking framework with worst-case test coverage: LookupCoin with 400 power-of-2 combinations testing BST depth range 2-21, ValueContains with 1000+ cases using multiplied_sizes model for x * y complexity. Includes R statistical models: linearInZ for LookupCoin, multiplied_sizes for ValueContains to properly account for both container and contained sizes.
Update all three cost model variants (A, B, C) with parameters fitted from comprehensive benchmark runs. Includes extensive timing data covering full parameter ranges for all four Value builtins. Models derived from remote benchmark runs on dedicated hardware with systematic worst-case test coverage ensuring conservative on-chain cost estimates.
Update test expectations across the codebase to reflect refined cost models: conformance test budgets (8 cases), ParamName additions for V1/V2/V3 ledger APIs (11 new params per version), param count tests, cost model registrations, and generator support. All updates reflect the transition from placeholder costs to fitted models.
Document the addition of fitted cost model parameters for Value-related builtins based on comprehensive benchmark measurements.
Fix bug where worst-case entry could be duplicated in selectedEntries when it appears at a low position in allEntries (which happens for containers with small tokensPerPolicy values).

The issue occurred because the code took the first N-1 entries from allEntries and then appended worstCaseEntry, without checking if worstCaseEntry was already included in those first N-1 entries. For containers like 32768×2, the worst-case entry (policy[0], token[1]) is at position 1, so it was included in both the "others" list and explicitly appended, creating a duplicate. Value.fromList deduplicates entries, resulting in benchmarks with one fewer entry than intended (e.g., 99 instead of 100), producing incorrect worst-case measurements.

Solution: Filter out worstCaseEntry from allEntries before taking the first N-1 entries, ensuring it only appears once at the end of the selected entries list.
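The fix described in this commit message can be sketched as follows (the names here are hypothetical, not the PR's actual code):

```haskell
-- Sketch of the dedup fix: exclude the worst-case entry before taking
-- the first (n - 1) entries, then append it exactly once at the end.
selectEntries :: Eq a => Int -> a -> [a] -> [a]
selectEntries n worstCase allEntries =
  let others = take (n - 1) (filter (/= worstCase) allEntries)
  in  others ++ [worstCase]

main :: IO ()
main = do
  -- The worst case sits at a low position in allEntries (as with small
  -- tokensPerPolicy values); it must still appear exactly once.
  print (selectEntries 4 "w" ["a", "w", "b", "c", "d"])  -- ["a","b","c","w"]
```

Without the filter, "w" would appear both in the first three entries and at the end, and Value.fromList would collapse the duplicate.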
Replace manual iteration + lookupCoin implementation with Data.Map.Strict's isSubmapOfBy, which provides 2-4x performance improvement through:

- Parallel tree traversal instead of n₂ independent binary searches
- Better cache locality from sequential traversal
- Early termination on first mismatch
- Reduced function call overhead

Implementation change:

- Old: foldrWithKey + lookupCoin for each entry (O(n₂ × log(max(m₁, k₁))))
- New: isSubmapOfBy (isSubmapOfBy (<=)) (O(m₂ × k_avg) with better constants)

Semantic equivalence verified:

- Both check v2 ⊆ v1 using q2 ≤ q1 for all entries
- All plutus-core-test property tests pass (99 tests × 3 variants)
- Conformance tests show expected budget reduction (~50% CPU cost reduction)

Next steps:

- Re-benchmark with /costing:remote to measure actual speedup
- Re-fit cost model parameters (expect slope reduction from 6548 to ~1637-2183)
- Update conformance test budget expectations after cost model update

Credit: Based on optimization discovered by Kenneth.
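A minimal sketch of the new containment check, on a simplified stand-in for the Value type (the real type uses ByteString keys and the builtin's wrapper types):

```haskell
import qualified Data.Map.Strict as Map

-- Simplified stand-in for the nested-map Value representation.
type Value = Map.Map String (Map.Map String Integer)

-- v2 is contained in v1 iff every (policy, token) entry of v2 is
-- present in v1 with quantity q2 <= q1, checked by a nested submap test.
valueContains :: Value -> Value -> Bool
valueContains v1 v2 = Map.isSubmapOfBy (Map.isSubmapOfBy (<=)) v2 v1

main :: IO ()
main = do
  let v1 = Map.fromList [("p1", Map.fromList [("t1", 5), ("t2", 3)])]
      v2 = Map.fromList [("p1", Map.fromList [("t1", 2)])]
  print (valueContains v1 v2)  -- True:  2 <= 5
  print (valueContains v2 v1)  -- False: t2 is missing and 5 > 2
```

The outer isSubmapOfBy walks both trees together, which is where the parallel-traversal and early-termination benefits listed above come from.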
Optimize generateConstrainedValueWithMaxPolicy to minimize off-path map sizes while maintaining worst-case lookup guarantees:

1. Sort keys explicitly to establish predictable BST structure
2. Select maximum keys (last in sorted order) for worst-case depth
3. Populate only target policy with full token set (tokensPerPolicy)
4. Use minimal maps (1 token) for all other policies

Impact:

- 99.7% reduction in benchmark value size (524K → 1.5K entries)
- ~340× faster map construction during benchmark generation
- ~99.7% memory reduction (52 MB → 150 KB per value)
- Zero change to cost measurements (worst-case preserved)

Affects: LookupCoin, ValueContains benchmarks

Formula: totalEntries = tokensPerPolicy + (numPolicies - 1)
Example: 1024 policies × 512 tokens = 1,535 entries (was 524,288)

Rationale: BST lookups only traverse one path from root to leaf. Off-path policies are never visited, so their inner map sizes don't affect measurement. Reducing off-path maps from tokensPerPolicy to 1 eliminates 99.7% of irrelevant data without changing worst-case cost.

Technical details:

- ByteString keys already use worst-case comparison (28-byte prefix)
- Sorting + last selection guarantees maximum BST depth (rightmost leaf)
- Target policy still has full token set for worst-case inner lookup
- Validates correct behavior: build succeeds, benchmarks run normally
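A sketch of this generation strategy under the assumptions above (names and key types are hypothetical simplifications of the real generator):

```haskell
import qualified Data.Map.Strict as Map
import Data.List (sort)

-- Only the maximum (rightmost) policy gets the full token set; every
-- other policy gets a single token, since off-path inner maps are
-- never traversed by a worst-case lookup.
mkWorstCaseValue :: [String] -> [String] -> Map.Map String (Map.Map String Integer)
mkWorstCaseValue policies tokens =
  let sortedPolicies = sort policies
      targetPolicy   = last sortedPolicies          -- rightmost leaf of outer BST
      fullInner      = Map.fromList [(t, 1) | t <- tokens]
      smallInner     = Map.fromList [(head tokens, 1)]
      inner p        = if p == targetPolicy then fullInner else smallInner
  in  Map.fromList [(p, inner p) | p <- sortedPolicies]

main :: IO ()
main = do
  let v = mkWorstCaseValue ["p1", "p2", "p3"] ["t1", "t2", "t3", "t4"]
  -- totalEntries = tokensPerPolicy + (numPolicies - 1) = 4 + 2 = 6
  print (sum (map Map.size (Map.elems v)))  -- 6
```

This matches the formula in the commit message: the target policy contributes tokensPerPolicy entries and each remaining policy contributes one.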
…ization

Update benchmark data and cost model parameters based on optimized valueContains implementation using Map.isSubmapOfBy. Benchmark results show significant performance improvement:

- Slope: 6548 → 1470 (4.5x speedup in per-operation cost)
- Intercept: 1000 → 1,163,050 (increased fixed overhead)

The slope reduction confirms the 3-4x speedup observed in local testing. Higher intercept may reflect actual setup overhead in isSubmapOfBy or statistical fitting on the new benchmark distribution.

Benchmark data: 1023 ValueContains measurements from GitHub Actions run 19367901303 testing the optimized implementation.
@Unisay I still need a summary of how the main recent discussion points were addressed (or why, if not addressed), so reviewers know where to look.

It would also be helpful if you reply to each unresolved comment above to indicate if it has been addressed or why it hasn't.
Update benchmark results for ValueData/UnValueData/LookupCoin functions and regenerate builtin cost models A, B, and C with new CPU cost parameters based on latest GitHub Actions benchmarking data. The ValueData and UnValueData benchmark results have been replaced with updated measurements that reflect the current performance characteristics. Cost model CPU parameters adjusted accordingly while preserving memory cost models unchanged.
Update conformance test golden files to reflect new cost models after latest benchmark measurements. The optimized valueContains implementation and updated LookupCoin costs result in different CPU budget usage. All evaluation results remain correct - only budget expectations changed to match actual costs from updated builtinCostModelA/B/C.json files.
Replace byte-based limit (30,000 bytes = 416 entries) with a simple hardcoded limit of 100,000 entries based on execution budget constraints.

Rationale:

- Scripts can programmatically generate Values larger than ledger storage limits without storing them on-chain
- The real constraint is CPU execution budget, not storage or memory
- 100K entries is achievable within max execution budget while leaving room for actual script logic
- Simpler implementation: direct entry count instead of byte-to-entry conversion

This change will require re-benchmarking Value-related builtins:

- LookupCoin
- ValueContains
- ValueData
- UnValueData
Replace integer-based key generation with direct random byte generation as suggested in code review. This eliminates unnecessary bitwise operations while achieving the same worst-case key pattern (0xFF prefix + 4 random bytes).

Benefits:

- Simpler, more readable code
- Removes unused Data.Bits import
- Eliminates helper function mkWorstCaseKey
- Same collision probability (~2^-32)
- Same worst-case ByteString comparison behavior
Update cost parameters for ValueData and UnValueData builtins based on fresh benchmark runs. The ValueData constant cost decreased slightly (199831 → 194713) while UnValueData slope increased significantly (16782 → 43200), reflecting more accurate characterization of serialization costs across different Value sizes. Benchmark data shows updated timing measurements for 100 test cases covering various Value entry counts, improving cost model accuracy for on-chain script execution budgeting.
zliu41 left a comment
For ValueData and UnValueData, you tested "randomValue 1 to 100_000 entries", but for LookupCoin and ValueContains, why is the max value size only 1448? Or is the description not up to date?
Otherwise LGTM - nice work!
kwxm left a comment
I think this looks basically OK (modulo a few things mentioned in the comments) and I've OK'd it so that we can merge it and make progress. However I want to think a bit more about the complexity (and benchmarking) of valueContains and I may come back with more comments later.
```haskell
, paramIndexArray = Id $ ModelTwoArgumentsConstantCost 32
  -- Builtin values
, paramLookupCoin = Id $ ModelThreeArgumentsConstantCost 10
, paramValueContains = Id $ ModelTwoArgumentsConstantCost 32
```
```diff
-, paramValueContains = Id $ ModelTwoArgumentsConstantCost 32
+, paramValueContains = Id $ boolMemModel
```
```haskell
  -- Builtin values
, paramLookupCoin = Id $ ModelThreeArgumentsConstantCost 10
, paramValueContains = Id $ ModelTwoArgumentsConstantCost 32
, paramValueData = Id $ ModelOneArgumentConstantCost 32
```
I think the memory models for valueData and unValueData need to be much bigger, since they're supposed to represent the total amount of memory used by the returned value. Experimenting with the results of generateTestValues in Benchmarks.Values, I got a list of Values with the following memory usages:
[0,55539,12118,10211,45715,8631,25078,1706,13340,24360,17529,11374,7681,71229,7345,14258,9161,14034,1339,48068,23206,41314,6950,16799,15401,14397,349,6205,4611,28034,34924,9816,11709,36200,2539,6722,53631,22384,32041,60206,15751,6760,94287,12000,37360,10870,35535,9649,6938,3891,57221,23825,16219,51830,3712,29569,3065,50249,9171,82416,42921,32171,1899,58222,17522,32561,30366,1596,5008,17914,5177,10016,9206,7188,93911,63802,8962,13202,8621,13884,80,43194,8112,54225,1077,1036,45364,31703,1872,24615,48316,9248,40840,8876,344,18905,2591,19916,1295,10229,18246]
and converting these into Data gave a list of objects with the following memory usages:
[4,1388479,302906,255279,1142879,215659,626942,42498,333432,608992,438217,284330,192029,1780729,183629,356454,229017,350854,33443,1201704,580142,1032854,173682,419955,385029,359893,8729,155129,115051,700854,873092,245368,292645,904992,63479,168018,1340779,559592,801017,1505154,393779,168932,2357179,300004,934004,271754,888379,241205,173322,97279,1430529,595629,405455,1295754,92756,739229,76629,1256229,229279,2060404,1073029,804267,47095,1455554,438042,814017,759154,39904,125204,447830,129345,250404,230154,179608,2347779,1595054,224030,330030,215505,347092,1944,1079842,202804,1355629,26149,25268,1134104,792579,46768,615343,1207904,231168,1021004,221844,8292,472605,64647,497868,32379,255693,456130]
Zipping with div, it looks as if the memory usages of the Data objects are generally 24-25 times the memory usages of the corresponding Value objects, so on the face of it the memory model for valueData should probably multiply by 25 and the memory model for unValueData should probably divide by 25 (which we can't currently do). However this is misleading because the "memory usage" for a Value object is the totalSize, i.e. the total number of nodes in inner maps. We really want the total amount of memory occupied by the value, which will be approximated by something like (size of outer map) * (size of currency name) + totalSize * (size of token name + size of quantity) (although we'll just be generating pointers to the existing names and quantities, not copying them). Unfortunately we have to use the same size measure as the denotation, so we can't feed the actual memory usage to the memory costing function. If we look at the memory usage function for Data then we may be able to work out how it relates to the actual memory usage of the corresponding Value, and if we're lucky it may turn out to be something like 25 times the number of nodes in the outer map. This will need a bit of investigation though.
```
    "arguments": 4,
    "type": "constant_cost"
}
"addInteger": {
```
The indentation seems to have changed in these files, which makes it tricky to see what the important differences are. Did they get reformatted by an editor or something?
This was an unintended change! I was planning to fix indentation in a separate PR...
```r
filtered <- data %>%
  filter.and.check.nonempty(fname) %>%
  discard.overhead ()
m <- lm(t ~ I(x_mem * y_mem), filtered)
```
I think this model may be inaccurate since we changed the implementation of valueContains. I'll think about that, but for the time being I think the predictions actually look pretty close to the benchmark results, so it should be safe to merge this so that we can move on, but come back to it later.
| } | ||
|
|
||
| # Sizes of parameters are used as is (unwrapped): | ||
| valueDataModel <- constantModel ("ValueData") |
I'm still a bit mystified about why this is constant cost, but I think the benchmarks are doing the right thing and the results do in fact seem to be pretty constant. Maybe we could make it linearInX with a zero (or at least small) slope in case we have to change it later (it's safe to use a linear function to represent a constant one, but difficult to change from constant to linear later).
```diff
- -- Assume 64 Int
- memoryUsageInteger i = fromIntegral $ I# (integerLog2# (abs i) `quotInt#` integerToInt 64) + 1
+ -- Assume 64-bit words
+ memoryUsageInteger i = fromIntegral (integerLog2 (abs i) `div` 64 + 1)
```
👍
```haskell
, paramListToArray = Id $ ModelOneArgumentLinearInX $ OneVariableLinearFunction 7 1
, paramIndexArray = Id $ ModelTwoArgumentsConstantCost 32
  -- Builtin values
, paramLookupCoin = Id $ ModelThreeArgumentsConstantCost 10
```
I guess this is OK. It'll actually return a pointer to an already-allocated quantity in the heap and I think we've used 10 for that elsewhere. That's probably not totally accurate, but the numbers in here don't bear much of a realationship to reality anyway.
```
3. Include deepest entry to force maximum BST traversal
4. Test multiple contained sizes to explore iteration count dimension
Result: ~1000 systematic worst-case benchmarks vs 100 random cases previously
```
I think this is maybe a bit too much. It takes about 2½ hours to benchmark just this function, which is much longer than any of the other builtins. Here's a list of the numbers of datapoints in the CSV file for the most intensively benchmarked builtins, and valueContains is much bigger than anything else. Maybe we could reduce it to 15x15 or something.
```
 202 EqualsString
 225 AddInteger
 256 DivideInteger
 256 ExpModInteger
 256 MultiplyInteger
 300 EqualsByteString
 400 ConstrData
 400 EqualsData
 400 LookupCoin
 400 MkPairData
 400 SerialiseData
 441 AppendByteString
 441 AppendString
 500 ChooseList
 625 AndByteString
1052 ValueContains
```
```haskell
  numEntries <- uniformRM (1, maxValueEntries) g
  generateValueMaxEntries numEntries g

-- | Maximum number of (policyId, tokenName, quantity) entries for Value generation.
```
I think these are a bit big. There's a danger that if you benchmark with very large inputs it'll be inaccurate for smaller (and more realistic) ones. If a function is constant cost then it doesn't matter too much what the input sizes are, and for linear costing functions we can maybe trade a bit of inaccuracy for bigger numbers in favour of accurate costing for smaller ones. The current CPU costing function for unValueData is 1000 + 43200*(total size), which grows pretty quickly.
```haskell
valueContainsArgs :: StdGen -> [(Value, Value)]
valueContainsArgs gen = runStateGen_ gen \g -> do
  {- ValueContains performs multiple LookupCoin operations (one per entry in contained).
```
I think this is no longer accurate: it's not always searching from the root of the containing value. I'll try to say more about this later.
```haskell
lookupCoinArgs :: StdGen -> [(ByteString, ByteString, Value)]
lookupCoinArgs gen = runStateGen_ gen \(g :: g) -> do
  {- Exhaustive power-of-2 combinations for BST worst-case benchmarking.
```
I don't think this is covering the worst case. The attached plot shows the raw benchmark results for lookupCoin with a regression line fitted. Above every size there are a number of points that take different times, so I think this is benchmarking the average case. Ideally we'd just get the top point of each column of points and fit a line through those. However, it probably doesn't matter too much. The vertical columns look quite big, but in fact the difference from the regression line is only about 3-4%, which seems pretty acceptable.
Maybe some of the stuff about the benchmarking strategy in the initial PR comment could go in the file containing the benchmarks, so that we can find it when we look at the file in a few years and wonder why the benchmarks are like they are. I think there's some overlap with the existing comments, but there's stuff that I don't think is covered in the file.
Costing Value Builtins with Worst-Case Benchmarking
Overview
This PR implements costing for four Plutus Core Value builtins:
LookupCoin, ValueContains, ValueData, and UnValueData. The implementation uses a worst-case oriented benchmarking strategy that ensures conservative cost estimates for adversarial on-chain scenarios.

Values in Plutus Core are implemented as nested Maps: Map PolicyId (Map TokenName Quantity), backed by BST-based Data.Map. The benchmarking approach systematically explores BST worst-case behavior through careful test case generation.

Cost Models by Builtin
LookupCoin
Cost Model Type: linear_in_z (linear in sum of logarithms): intercept + slope × (log(outerSize) + log(maxInnerSize))

Size Measure: ValueLogOuterSizeAddLogMaxInnerSize = log₂(numPolicies) + log₂(maxTokensPerPolicy)

Rationale: Looking up a coin requires traversing the outer BST to find the policy, then traversing the largest inner BST to find the token. The sum of logarithms accurately models the total comparison cost.
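As a rough illustration, the size measure can be computed as below (the helper name is hypothetical, and the real wrapper uses an integer log rather than this floating-point approximation):

```haskell
-- Illustrative LookupCoin size measure:
-- log2(number of policies) + log2(max tokens in any single policy).
logOuterPlusLogMaxInner :: Int -> Int -> Int
logOuterPlusLogMaxInner numPolicies maxTokens =
  intLog2 numPolicies + intLog2 maxTokens
  where
    intLog2 :: Int -> Int
    intLog2 n = floor (logBase 2 (fromIntegral n) :: Double)

main :: IO ()
main = print (logOuterPlusLogMaxInner 1024 512)  -- 10 + 9 = 19
```

A Value with 1024 policies whose largest policy holds 512 tokens thus costs as a size-19 input, matching the depth ranges explored by the benchmarks.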
ValueContains
Cost Model Type: multiplied_sizes (product of dimensions): intercept + slope × container_log_size × contained_total_entries

Size Measures: ValueLogOuterSizeAddLogMaxInnerSize for the container (same as LookupCoin) and ValueTotalSize (total number of entries) for the contained value.

Rationale: ValueContains performs one LookupCoin operation per entry in the contained Value, so the cost is the product of the container's lookup depth and the contained Value's entry count.

Result: O(n₂ × (log m₁ + log k₁)) complexity.
Implementation Note: Uses Map.isSubmapOfBy with optimized short-circuiting, providing a 2-4x speedup over naive iteration.

ValueData
Cost Model Type: constant_cost

Size Measure: Raw Value (no wrapper needed)
Rationale: Wrapping a Value as Plutus Data is a constant-time pointer operation. The Data structure already exists in memory; valueData just changes the type tag. Benchmarks confirm minimal variance across Value sizes.
UnValueData
Cost Model Type: linear_in_x (linear in Data size): intercept + slope × data_size

Size Measure: Standard Data size (built-in)
Rationale: Deserializing Data to Value requires traversing the Data structure and validating the nested map structure. Cost scales linearly with Data size. The slope (43,200 steps per Data node) reflects validation overhead.
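For illustration, a linear_in_x costing function of this shape evaluates as a one-variable linear function; the intercept (1000) and slope (43200) below are the fitted unValueData values quoted in this PR's review discussion:

```haskell
-- One-variable linear costing function, as used for linear_in_x models:
-- cost = intercept + slope * size.
linearCost :: Integer -> Integer -> Integer -> Integer
linearCost intercept slope size = intercept + slope * size

main :: IO ()
main = print (linearCost 1000 43200 100)  -- 4321000
```

This shows why large inputs dominate quickly: at 100 Data nodes, the slope term already contributes over 4000x the fixed intercept.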
ExMemoryUsage Newtypes: Size Measure Logic
ValueLogOuterSizeAddLogMaxInnerSize
Purpose: Models worst-case BST traversal depth through nested maps.
Key Insight: For a Value with m policies where the largest policy has k tokens, worst-case lookup requires roughly log₂(m) comparisons in the outer BST plus log₂(k) comparisons in the largest inner BST.
Why sum, not max?: Experimental benchmarks showed lookup time scales linearly with the sum of depths. Both traversals must complete; they're not alternatives.
ValueTotalSize
Purpose: Counts total number of (policyId, tokenName, quantity) entries across all policies.
Usage: Measures iteration count for operations like ValueContains that must check every entry in the contained Value.
Worst-Case Benchmarking Strategy
The benchmarking methodology prioritizes conservative cost estimates through systematic worst-case generation.
1. Worst-Case BST Keys
Problem: Random ByteString keys typically differ in the first 1-2 bytes, making BST comparisons artificially cheap (short-circuit after 1-2 byte comparisons).
Solution: Generate keys that share a long common prefix, so every comparison must scan past the shared prefix before finding a differing byte.
Result: Forces full 32-byte comparisons during BST traversal, reflecting adversarial scenarios where an attacker crafts keys to maximize comparison cost.
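A minimal sketch of this key scheme (the helper name is illustrative; the real generator uses 4 random trailing bytes rather than fixed ones):

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Worst-case key: a shared 28-byte 0xFF prefix forces comparisons to
-- scan the whole prefix, with 4 trailing bytes distinguishing keys.
worstCaseKey :: [Word8] -> BS.ByteString
worstCaseKey suffix = BS.replicate 28 0xFF <> BS.pack (take 4 suffix)

main :: IO ()
main = do
  let k1 = worstCaseKey [0, 0, 0, 1]
      k2 = worstCaseKey [0, 0, 0, 2]
  print (BS.length k1)  -- 32
  print (k1 < k2)       -- True: ordering decided only by the final byte
```

Because the first 28 bytes are identical, every BST comparison during traversal does near-maximal work, which is exactly the adversarial case being benchmarked.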
2. Power-of-2 Size Grid
Approach: Test all pairwise combinations of sizes drawn from a fixed sequence.
This sequence includes both powers of 2 (2ⁿ) and geometric means (2^(n+0.5) ≈ 2ⁿ × √2).
Coverage: 400 size combinations for LookupCoin (BST depths 2-21) and 1000+ cases for ValueContains.
Rationale: Power-of-2 sizing systematically explores different BST depths. The half-powers provide finer granularity between powers, ensuring no "gaps" in depth coverage.
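The size sequence described above can be sketched as follows (function name hypothetical):

```haskell
-- Benchmark size grid: powers of two interleaved with their geometric
-- midpoints 2^n * sqrt 2 (rounded), giving finer depth coverage
-- between consecutive powers of two.
sizeGrid :: Int -> [Int]
sizeGrid maxExp = concat
  [ [2 ^ n, round (2 ^^ n * sqrt 2 :: Double)] | n <- [1 .. maxExp] ]

main :: IO ()
main = print (take 8 (sizeGrid 10))  -- [2,3,4,6,8,11,16,23]
```

Each consecutive pair differs by a factor of about √2, so BST depths increase in half-level steps across the grid.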
3. Maximum Depth Targeting
For each Value generated, we track the deepest entry (rightmost in both outer and inner BSTs).
Key Optimization: Only the max policy receives all tokens. Other policies get a single token each. This minimizes "off-path" costs while maximizing depth at the target lookup location.
Lookup Keys: Benchmarks always query (maxPolicyId, deepestToken), forcing maximum BST traversal depth.

Benchmark Generation by Builtin
LookupCoin: Exhaustive Depth Coverage
Result: 400 test points systematically covering all depth combinations from (2,2) to (21,21).
Lookup Strategy: Every benchmark queries the deepest possible entry in the Value's BST structure.
ValueContains: Subset with Worst-Case Entry
Contained Value Construction: containedSize - 1 arbitrary entries, plus the container's deepest entry.

Critical Detail: Placing the deepest entry last ensures it appears exactly once in the contained Value and that the most expensive lookup is always exercised.
Result: ~1000 systematic test cases exploring both container depth and iteration count dimensions.
ValueData & UnValueData: Random Distribution
Strategy: Random sampling with a uniform distribution across entry counts from 1 to 100,000.
Maximum Size: 100,000 entries (up from original 416), reflecting execution budget constraints rather than ledger storage limits.
Rationale: Scripts can programmatically generate Values much larger than on-chain storage allows. The 100K limit represents what's achievable within maximum CPU execution budget (~10-15 billion picoseconds) while leaving room for actual script logic.
Constant vs Linear Models: ValueData shows constant cost (pointer wrapping), while UnValueData shows linear cost (structural validation), confirmed by benchmarks across this wide size range.
Performance Impact
The valueContains implementation received a 2-4x speedup optimization:

- Before: Manual iteration with early exit
- After: Native Map.isSubmapOfBy
Benefit: Leverages optimized Map internals with better short-circuiting and comparison batching. Cost model updated to reflect improved performance.
Testing & Validation
Visualization
Interactive cost model visualizations available at:
https://plutus.cardano.intersectmbo.org/cost-models/
To preview this PR's cost models, configure the data source to load from this branch:
(e.g. /cost-models/valuecontains/)

- https://raw.githubusercontent.com/IntersectMBO/plutus/yura/costing-builtin-value/plutus-core/cost-model/data/benching-conway.csv
- https://raw.githubusercontent.com/IntersectMBO/plutus/yura/costing-builtin-value/plutus-core/cost-model/data/builtinCostModelC.json

Available visualizations:
lookupCoin, valueContains, valueData, unValueData

Summary
This PR establishes production-ready costing for Value builtins through worst-case oriented benchmarking, cost models fitted for all three variants (A, B, C), and updated conformance test budgets.
The worst-case focus—common-prefix keys, maximum-depth lookups, systematic size coverage—provides strong safety guarantees for on-chain execution budgeting.